arrdem
Collaborator

@arrdem arrdem commented Sep 25, 2025

Design doc

An implementation of pip based on consuming PEP 751 [1]-like lockfiles, specifically uv lockfiles, which contain internal dependency graph information that the PEP 751 specification labels optional.

Follows in the footsteps of rules_js's pnpm support by consuming a lockfile which contains enough information to materialize dependencies without performing any repository-time operations that could be platform-dependent.

Features

  • Supports cross-platform builds of wheels
  • Supports hermetic source builds of wheels
  • Automatically handles dependency cycles
  • Creates unified pip hubs which span virtualenv/dependency solution boundaries
  • Pip library targets from deactivated venvs are incompatible
  • Platform constraints on pip libraries do not prevent the creation of a library target
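As an illustration of the dependency-cycle bullet above: one common way to cope with cycles is to collapse each cycle into a single group (a strongly connected component) that can then be provisioned as a unit. The sketch below is written in Python rather than Starlark, works on a hypothetical in-memory graph instead of a real lockfile, and is not taken from this PR's implementation; the package names are made up.

```python
def strongly_connected_components(graph):
    """Return the SCCs of `graph` ({node: [deps]}) using Tarjan's algorithm."""
    index = {}       # discovery order of each node
    lowlink = {}     # smallest discovery index reachable from each node
    stack = []
    on_stack = set()
    components = []

    def visit(node):
        index[node] = lowlink[node] = len(index)
        stack.append(node)
        on_stack.add(node)
        for dep in graph.get(node, []):
            if dep not in index:
                visit(dep)
                lowlink[node] = min(lowlink[node], lowlink[dep])
            elif dep in on_stack:
                lowlink[node] = min(lowlink[node], index[dep])
        if lowlink[node] == index[node]:
            # `node` is the root of a component: pop its members off the stack.
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            components.append(sorted(component))

    for node in sorted(graph):
        if node not in index:
            visit(node)
    return components


# Hypothetical dependency graph; names are illustrative only.
GRAPH = {
    "app": ["core", "cowsay"],
    "core": ["providers"],
    "providers": ["core"],  # core <-> providers form a cycle
    "cowsay": [],
}

COMPONENTS = strongly_connected_components(GRAPH)
CYCLES = [c for c in COMPONENTS if len(c) > 1]
```

Any component with more than one member is a cycle; acyclic packages end up as singleton components.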

Example

# MODULE.bazel content
pip = use_extension("@aspect_rules_py//pip:extension.bzl", "pip")
pip.declare_hub(hub_name = "pip")

pip.declare_venv(hub_name = "pip", venv_name = "a")
pip.lockfile(hub_name = "pip", venv_name = "a", lockfile = "third_party/py/venvs/pylock-a.toml")

pip.declare_venv(hub_name = "pip", venv_name = "b")
pip.lockfile(hub_name = "pip", venv_name = "b", lockfile = "third_party/py/venvs/pylock-b.toml")

use_repo(pip, "pip")

# BUILD.bazel content

py_venv_binary(
  name = "foo",
  srcs = [
    "foo.py",
  ],
  main = "foo.py",
  deps = [
    "@pip//cowsay", # Pull cowsay from the configured venv
  ],
  venv = "a", # Configure the default venv to be "a"; may be overridden at the CLI
)

The active venv can be overridden at the CLI, for instance by specifying --@pip//venv=b here, or by using transitions to (re)set that same flag.
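The override behavior described here can be pictured with a toy model: a target carries a default venv, and an explicitly set flag value takes precedence over it. The sketch below is plain Python, not the extension's code; the `HUB` table, the `resolve` helper, and the version numbers are all hypothetical.

```python
# Hypothetical hub contents: venv name -> package -> pinned version.
HUB = {
    "a": {"cowsay": "6.1"},
    "b": {"cowsay": "5.0"},
}


def resolve(package, default_venv, flag_venv=None):
    """Pick the venv (flag wins over the target default) and look up the package."""
    venv = flag_venv or default_venv
    if venv not in HUB:
        raise KeyError("unknown venv: %s" % venv)
    if package not in HUB[venv]:
        raise KeyError("%s is not locked in venv %s" % (package, venv))
    return venv, HUB[venv][package]


DEFAULT_CHOICE = resolve("cowsay", default_venv="a")
OVERRIDDEN_CHOICE = resolve("cowsay", default_venv="a", flag_venv="b")
```

In this model the same label, `cowsay`, yields a different pinned version depending only on which venv is active.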

Appendix

[1] https://peps.python.org/pep-0751/
[2] https://peps.python.org/pep-0751/#locking-build-requirements-for-sdists

Changes are visible to end-users: yes

  • Searched for relevant documentation and updated as needed: yes/no
  • Breaking change (forces users to change their own code or config): yes/no
  • Suggested release notes appear below: yes/no

To do list

  • Document the extension
  • Document that we consider the extension unstable
  • Add a configuration option for forcing source builds
  • Add a transitionable configuration option for depending on pip deps as whl files not py_library targets for pex etc.
  • Add support for annotating replacement of pip deps with internal builds (--editable / vendoring)
  • Go back over the interpreter compatibility machinery and align it with rules_python's config settings for now
  • Go back over the interpreter feature flags and align them with rules_python's config settings for now (see "fix(toolchains): correctly register musl/freethreaded toolchains for workspace", bazel-contrib/rules_python#3314)
  • Document a --editable-like workflow
  • Go over zbarsky's nits
  • Audit the comments/FIXMEs for accuracy
  • Flatten the git log
  • Look into platform conditional deps & how they get represented
  • Consider a feature flag to turn on rules_python's package name normalization so that migration is easier.
  • Match the rules_python @hub//package[:package] syntax?
  • Match the rules_rust @hub//:package syntax?
  • Get a toml.bzl working
  • Toolchainize the uv dependency

Test plan

  • Manually test flipping the venv command line flag
  • Manually test flipping the venv transition attr
  • Create py_venv_tests covering that different versions of the same package can be concurrently configured via different venvs
  • Create a py_venv_test covering that Airflow or another package with dependency cycles can be provisioned
  • Create a py_venv_binary embedded in and transitioned for a Linux OCI container across arch boundaries

aspect-workflows bot commented Sep 25, 2025

Test

⚠️ Buildkite build #464 failed.

//py/tests/py_venv_conflict:test_venv_ignore failed to build

Errors encountered while applying Starlark transition
Errors encountered while applying Starlark transition

//docs:update_5_test failed to build

in fail_with_message_test rule //docs:update_5_test:
Traceback (most recent call last):
	File "/mnt/ephemeral/output/rules_py/__main__/external/aspect_bazel_lib/lib/private/fail_with_message_test.bzl", line 4, column 9, in _fail_with_message_test_impl
		fail(ctx.attr.message)
Error in fail:
 
@//docs:unstable.md does not exist. To create & update this and other generated files, run:
 
    bazel run @//docs:update
 
To create an update *only* this file, run:
 
    bazel run //docs:update_5

@@pypi_setuptools//:whl failed to build

no such package '@@pypi_setuptools//': The repository '@@pypi_setuptools' could not be resolved: Repository '@@pypi_setuptools' is not defined

@@pip2//click:click failed to build

no such package '@@pip2//click': The repository '@@pip2' could not be resolved: Repository '@@pip2' is not defined

7 other actions failed to build.

Failed tests (14)
//:requirements_test [k8-fastbuild]
//docs:update_6_test [k8-fastbuild]
//examples/multi_version:py_version_default_test [k8-fastbuild]
//examples/multi_version:py_version_test [k8-fastbuild-ST-494921797612]
//py/tests/py-binary:runfiles_from_pip_test [k8-fastbuild]
//py/tests/py-external-venv:test [k8-fastbuild-ST-4d16e4d42f67]
//py/tests/py_image_layer:my_app_layers_test_test [k8-fastbuild]
//py/tests/py_venv_conflict:validate_import_roots [k8-fastbuild-ST-02ecc3a0f3d8]
//py/tests/py_venv_image_layer:my_app_amd64_layers_test [k8-fastbuild]
//py/tests/py_venv_image_layer:my_app_arm64_layers_test [k8-fastbuild]
//py/tests/py_venv_image_layer:py_amd64_image_command_test [k8-fastbuild]
//py/tests/py_venv_image_layer:py_amd64_image_content_test [k8-fastbuild]
//py/tests/py_venv_image_layer:py_arm64_image_content_test [k8-fastbuild]
//py/tests/virtual/django:requirements_test [k8-fastbuild]

💡 To reproduce the build failures, run

bazel build //py/tests/py_venv_conflict:test_venv_ignore //docs:update_5_test @@pypi_setuptools//:whl @@pip2//click:click

💡 To reproduce the test failures, run

bazel test //py/tests/py_venv_image_layer:py_arm64_image_content_test //py/tests/py_venv_image_layer:py_amd64_image_command_test //py/tests/py-external-venv:test //py/tests/py_venv_image_layer:my_app_amd64_layers_test //py/tests/py-binary:runfiles_from_pip_test //py/tests/py_venv_image_layer:py_amd64_image_content_test //py/tests/virtual/django:requirements_test //examples/multi_version:py_version_default_test //examples/multi_version:py_version_test //py/tests/py_venv_conflict:validate_import_roots //:requirements_test //py/tests/py_image_layer:my_app_layers_test_test //docs:update_6_test //py/tests/py_venv_image_layer:my_app_arm64_layers_test


for minor in MINORS:
    selects.config_setting_group(
        name = "is_{}{}{}".format(interpreter, major, minor),
Contributor

generally %s is faster than .format because it avoids a function call and has simpler (faster) arg conversion. Also if you have only a single substitution it helps even more because you don't even need to construct a tuple. Up to you where to use it, but at least for the functions in the extension that are called repeatedly I would strongly consider it. (Honestly I pretty much always use it unless I have a complex/multiline template with 4+ args and/or using an arg multiple times)
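To make the suggestion concrete, the two spellings produce identical strings, so swapping one for the other does not change the generated target names. The sketch below is in Python, whose `%` and `.format` behave the same way for these simple cases; the values are illustrative, and it does not measure the performance difference, which is the reviewer's claim about Starlark.

```python
interpreter, major, minor = "cp", 3, 12  # illustrative values only

# Multiple substitutions: .format versus % with a tuple.
with_format = "is_{}{}{}".format(interpreter, major, minor)
with_percent = "is_%s%s%s" % (interpreter, major, minor)

# Single substitution: no tuple is needed on the right-hand side of %.
condition = "linux_x86_64"  # illustrative value only
single_format = ":{}".format(condition)
single_percent = ":%s" % condition
single_concat = ":" + condition
```

One caveat with the single-argument `%` form: if the argument itself may be a tuple, it has to be wrapped in a one-element tuple to be formatted as a single value.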

Contributor

@dzbarsky dzbarsky left a comment

since you said you love nits...i left a bunch of nits. feel free to apply or disregard as you see fit!

# We loop up to the second-to-last item to ensure we always have a 'next' stage.
for i, current_stage in enumerate(stages):
    selects.config_setting_group(
        name = "{}".format(current_stage.name),
Contributor

nit: name = current_stage.name :)

selects.config_setting_group(
    name = "{}".format(current_stage.name),
    match_any = [
        ":{}".format(current_stage.condition),
Contributor

":" + current_stage.condition or ":%s" % current_stage.condition are both faster and less chars :)

"""
selects.config_setting_group(
name = "{}",
match_all = {},
Contributor

it might be a bit clearer to put the brackets in the expression here, especially since I think the way you did it the trailing ] isn't indented correctly? Here's a similar thing I did with build_deps here

# Collect all hubs, ensure we have no dupes
for mod in module_ctx.modules:
    for hub in mod.tags.declare_hub:
        hub_specs.setdefault(hub.hub_name, {})
Contributor

I saw you use setdefault like this a bunch, I think you can tweak the pattern slightly to avoid repeated lookups:

hub_specs.setdefault(hub.hub_name, {})[mod.name] = 1
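A small runnable sketch of that pattern, in Python (where `dict.setdefault` has the same return-the-stored-value behavior). The `DECLARATIONS` list is a hypothetical stand-in for iterating `module_ctx.modules` and their `declare_hub` tags.

```python
# Hypothetical (hub_name, module_name) pairs standing in for the module tags.
DECLARATIONS = [("pip", "root"), ("pip", "dep_module"), ("tools", "root")]

hub_specs = {}
for hub_name, mod_name in DECLARATIONS:
    # setdefault returns the inner dict (creating it on first use),
    # so it can be indexed directly without a second lookup.
    hub_specs.setdefault(hub_name, {})[mod_name] = 1
```

Because the inner dict is returned in the same expression, there is no separate `hub_specs[hub_name]` lookup after the `setdefault` call.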

problems = []
for hub_name, modules in hub_specs.items():
    if len(modules.keys()) > 1:
        problems.append(
Contributor

I think you could combine this with the previous loop and make hub_specs store hub_name -> mod_name (just check if you already have an entry and add to problems) instead of hub_name -> mod_name -> 1. I guess it would make it harder to handle 3 modules all using the same name, but meh? It would probably lead to simpler usage in the rest of the extension
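A rough sketch of the single-loop variant being proposed, written in Python for runnability. The `collect_hubs` helper, its input shape, and the message wording are hypothetical and are not code from the PR.

```python
def collect_hubs(declarations):
    """Map each hub name to its first declaring module, recording conflicts as we go."""
    hub_owner = {}  # hub_name -> mod_name
    problems = []
    for hub_name, mod_name in declarations:
        owner = hub_owner.get(hub_name)
        if owner is None:
            hub_owner[hub_name] = mod_name
        elif owner != mod_name:
            # Conflict detected inline; no second pass over hub_owner is needed.
            problems.append(
                "Hub %r is declared by both %r and %r" % (hub_name, owner, mod_name)
            )
    return hub_owner, problems


# Hypothetical declarations: two modules both claim the "pip" hub.
OWNERS, PROBLEMS = collect_hubs(
    [("pip", "root"), ("pip", "dep_module"), ("tools", "root")]
)
```

As the comment anticipates, a third module reusing the same hub name would produce a second pairwise message against the first owner rather than one message naming all three.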

load("//pip/private/constraints/platform:defs.bzl", "supported_platform")
load(":parse_whl_name.bzl", "parse_whl_name")

def format_arms(d):
Contributor

load(":parse_whl_name.bzl", "parse_whl_name")

def format_arms(d):
    content = [" \"{}\": \"{}\"".format(k, v) for k, v in d.items()]
Contributor

maybe clearer to use single quotes on the outside and/or repr the k/v to avoid inner ones?
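For illustration, the single-quoted spelling yields exactly the same strings as the escaped one. This is a Python sketch with hypothetical helper names and sample data; the `repr` variant is left out because Python's `repr` quotes strings differently from Starlark's.

```python
def format_arms_escaped(d):
    # Double-quoted literal, so the inner quotes need backslash escapes.
    return [" \"{}\": \"{}\"".format(k, v) for k, v in d.items()]


def format_arms_single_quoted(d):
    # Single quotes on the outside: the inner double quotes need no escaping.
    return [' "%s": "%s"' % (k, v) for k, v in d.items()]


# Hypothetical select arms: condition label -> target label.
ARMS = {"//conditions:linux": "@whl_linux//:pkg", "//conditions:macos": "@whl_macos//:pkg"}

ESCAPED = format_arms_escaped(ARMS)
SINGLE_QUOTED = format_arms_single_quoted(ARMS)
```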

def sort_select_arms(arms):
    # {(python, platform, abi): target}
    pairs = list(arms.items())
    pairs = sorted(pairs, key=select_key, reverse=True)
Contributor

can you pass arms.items() directly into sorted and skip the copy?
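A quick sketch of the simplification, in Python: `sorted` accepts any iterable and always returns a new list, so the intermediate `list(...)` copy is not needed. The `select_key` below is a hypothetical stand-in for the PR's key function, and the arm values are made up.

```python
def select_key(pair):
    # Hypothetical key: order by the (python, platform, abi) tuple itself.
    key, _target = pair
    return key


def sort_select_arms(arms):
    # {(python, platform, abi): target}
    # sorted() builds a fresh list from the items view; no explicit copy required.
    return sorted(arms.items(), key=select_key, reverse=True)


def sort_select_arms_with_copy(arms):
    # The original shape, kept only to compare results.
    pairs = list(arms.items())
    return sorted(pairs, key=select_key, reverse=True)


ARMS = {
    ("3.11", "linux", "cp311"): "//a",
    ("3.12", "linux", "cp312"): "//b",
}
SORTED_ARMS = sort_select_arms(ARMS)
```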

return [
    # FIXME: Need to generate PyInfo here
    DefaultInfo(
        files = depset([
Contributor

i guess you could create this depset once and reuse it below

@arrdem arrdem marked this pull request as ready for review October 3, 2025 17:36